4. Flow Control Statements

This is the second lesson in the Python for chemists crash course. In this lesson, we will learn how to control the order in which the statements of a program are run. You can watch the video for this lesson here or read the transcript below.

By default, a Python program is executed from the first line to the last, without repeating or skipping any lines. This is a rather rigid approach, so Python provides flow control statements that let us change the order of execution. We will get to know them below:

4.1. `if` `elif` `else`

if, elif and else are used for conditional execution of a code block. The syntax is as follows:

if condition:
    # block of code
    # block of code
    # block of code
# normal flow

The indented block of code is executed only if the condition evaluates to True; otherwise it is skipped. By convention, four spaces are used for indentation.

The condition can be any kind of Python expression. It is evaluated and converted to a boolean. If the result is True, the block of code is executed. If the result is False, it is skipped.

text = "This text is 32 characters long."

if len(text)>30:
    print("The text is longer than 30 characters.")

The text is longer than 30 characters.

The length of the text is longer than 30 characters, therefore the print command is executed.
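
Because the condition is converted to a boolean, it does not have to be a comparison. As a small sketch (the list names are made up for this illustration): empty containers, empty strings, 0 and None are converted to False, while most other values are converted to True.

```python
# Hypothetical measurement lists, used only for illustration
measured_masses = [12.1, 11.9, 12.0]
missing_masses = []

# A non-empty list is converted to True, so the block runs
if measured_masses:
    print("We have", len(measured_masses), "measurements.")

# An empty list is converted to False, so the else block runs
if missing_masses:
    print("We have measurements.")
else:
    print("No measurements yet.")

# bool() shows the conversion explicitly
truth_values = [bool(0), bool(3.5), bool(""), bool("H2O"), bool(None)]
print(truth_values)  # [False, True, False, True, False]
```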

if condition:
    # block of code
    # block of code
    # block of code
else:
    # else block
    # else block
# normal flow

if can be followed by an else statement. If the expression following if evaluates to True, all statements of the indented block are executed. The optional else block is executed only if that expression evaluates to False. The last line in the code above, which is indented at the same level as if, is executed in either case, and the normal program flow continues.

# this line requests input from the user
reply = input("Is python the best language ever? ") 

if reply == "yes" or reply=="absolutely":
    print("Correct")
else:
    print("You are wrong.")
    
print("Python is the best language ever.")

When the user inputs either one of the expected strings "yes" or "absolutely" then the first print statement is executed. Otherwise it is skipped. In the latter case, the second print statement "You are wrong." is executed. The last print statement containing the affirmation that python is the best language ever is run in either case.

An if statement can also be followed by one or multiple elifs.

if condition:
    # block of code
elif condition1:
    # elif block 1
elif condition2:
    # elif block 2
else:
    # else block
# normal flow

The code inside an elif block is run only if its condition evaluates to True and neither the if condition nor any of the elif conditions before it did.

# ask user for input of day
day = input("What weekday is today? ")

# figure out number of days to the weekend
if day == "Monday":
    print("4 more days to the weekend")
elif day == "Tuesday":
    print("3 more days to the weekend")
elif day == "Wednesday":
    print("2 more days to the weekend")
elif day == "Thursday":
    print("1 more day to the weekend")
elif day == "Friday":
    print("The weekend starts tomorrow")
elif day=="Saturday" or day == "Sunday":
    print("Weekend!!!!")
else:
    print("{} is not a recognized day.".format(day))

First, we ask the user to input the current weekday. Then we compare the input with each possible day. If the if or any of the elif expressions evaluates to True, we print the number of days to the weekend. If none of them evaluates to True, the user must have given an unrecognized input. We take care of that in the else block.

Note: In cases where we have to select a single element from a number of choices, like here, creating a dictionary and then indexing it might be a good alternative.
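
A minimal sketch of that dictionary alternative, assuming the same messages as above. The name messages is chosen for this illustration, and a fixed value replaces input() so that the example runs on its own.

```python
# Map each weekday to its message; the dictionary replaces the elif chain
messages = {
    "Monday": "4 more days to the weekend",
    "Tuesday": "3 more days to the weekend",
    "Wednesday": "2 more days to the weekend",
    "Thursday": "1 more day to the weekend",
    "Friday": "The weekend starts tomorrow",
    "Saturday": "Weekend!!!!",
    "Sunday": "Weekend!!!!",
}

day = "Wednesday"  # fixed value instead of input()

# A single membership test replaces the whole chain of comparisons
if day in messages:
    message = messages[day]
else:
    message = "{} is not a recognized day.".format(day)

print(message)  # 2 more days to the weekend
```

Adding a new case then only means adding a new key to the dictionary.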

4.2. `for` loops

for <variable> in <iterable>:
    #do stuff

for loops repeat the same code multiple times, assigning the value of one element of an iterable to a variable on every iteration.

The iterable is any python object that contains multiple elements, such as a tuple, a list, a dictionary or a string.

And it works like this:

for word in ["this", "is", "one", "word"]:
    print(word)
print("done!")

this
is
one
word
done!

For each iteration of the for loop, another element from the list is assigned to the variable word. It is then printed. When the list is exhausted, the control flow continues outside the for loop.

The range function is a very useful built-in Python function that can be used to create an iterable of integers. If only a single argument stop is given, range starts at 0 and stops at stop - 1. It is also possible to pass a start argument and a step argument. The stop value itself is never included.

range(stop) --> 0, 1, 2, ... stop - 1
range(start, stop) --> start, start + 1, ... stop - 1
range(start, stop, step) --> start, start + step, start + 2*step, ... (last value before stop)

Here, we create a list of the squares of all integers from 0 to 9. In the first iteration, 0 is assigned to number. The square of number, 0, is appended to the list list_of_squares. In the next iteration, 1 is assigned to number, squared and then appended to the list, and so on until we reach the last iteration, where 9 is assigned to number.

list_of_squares = []

for number in range(10):
    list_of_squares.append(number**2)
list_of_squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
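
The start and step arguments of range can be tried out in the same way. A short sketch (the variable names are chosen for this illustration), wrapping each range in list() so that the values become visible:

```python
# start and stop: stop itself is not included
from_two = list(range(2, 6))
print(from_two)   # [2, 3, 4, 5]

# start, stop and step: every second number below 10
evens = list(range(0, 10, 2))
print(evens)      # [0, 2, 4, 6, 8]

# a negative step counts downwards; again stop (0) is not included
countdown = list(range(5, 0, -1))
print(countdown)  # [5, 4, 3, 2, 1]
```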

If you want to break out of a for loop before the iterable is exhausted, you can use break.

For example, this code assigns one character of the string to letter on each iteration. We check if the letter equals " ". If it does not, we print the letter; if it does, the break statement is reached and the loop stops. Hence, the loop prints only the characters t, h, i, s before it reaches " " and stops.

for letter in "this is a sentence":
    if letter == " ":
        break
    else:
        print(letter)

t
h
i
s

Let’s look at another version of this script. This one asks for user input. If the string contains a full stop character, it outputs the position of the first full stop; otherwise it reminds the user to input a sentence containing a “.”. This code also shows a use of the in operator to check whether "." is contained in a string.

First, we use the input function to ask the user for a string. Then, we use not in in the if statement to check whether the string contains a “.” at all. If it doesn’t, there really isn’t a point in continuing. Hence, only if the string contains a “.” do we enter the else block and start looping over the characters in the string.

The function we use on the right side of the for ... in statement is enumerate. enumerate is a built-in Python function that takes elements out of an iterable (in this case the string) one by one and yields a tuple for each of them. The first element of the tuple is the count of the element (starting at 0), the second one is the element itself.

For example, enumerate("abc") would yield (0, "a"), (1, "b") and (2, "c").
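
To see these tuples directly, one can wrap enumerate in list(). A minimal sketch (the name pairs is chosen for this illustration):

```python
# list() collects all tuples that enumerate produces
pairs = list(enumerate("abc"))
print(pairs)  # [(0, 'a'), (1, 'b'), (2, 'c')]

# each item is an ordinary tuple: count first, element second
first_pair = pairs[0]
print(first_pair[0], first_pair[1])  # 0 a
```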

The for loop also contains a second new type of Python syntax: “unpacking”. If you have a list or a tuple on the right side of an assignment, you can assign each element of the list/tuple to a different variable by putting multiple variables on the left side of the assignment, separated by commas. (The number of variables on the left side needs to equal the number of elements in the list/tuple.) In the code below, we assign the first element of the tuple returned by enumerate to idx and the second one to character.

Inside the for loop, we check if the current character equals "." using if. In case that is true, we print the current value of idx and stop the iteration.

sentence = input("Please input a sentence here: ")

if "." not in sentence:
    print("This sentence does not contain a '.'")
else:
    for idx, character in enumerate(sentence):
        if character == ".":
            print("The full stop is at position {}".format(idx))
            break

Please input a sentence here: This is a test sentence that has one. Also, another one.
The full stop is at position 36

Note

Try modifying the code above to print the indices of all “.” in the string.

Which line would you need to change?

What would you need to do if you wanted to have a list of all indices of “.”?
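
One possible answer to the last question, as a sketch: leave out the break so that the loop runs to the end, and append each matching index to a list. The sentence is fixed here instead of coming from input(), and the name full_stop_indices is chosen for this illustration.

```python
sentence = "This is a test sentence that has one. Also, another one."

full_stop_indices = []  # collects the position of every "."

for idx, character in enumerate(sentence):
    if character == ".":
        # no break here, so the loop keeps looking for further full stops
        full_stop_indices.append(idx)

print(full_stop_indices)  # [36, 55]
```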

Dictionaries are also iterable. However, there are two things to keep in mind. First, only the keys of the dictionary are assigned to the variable. Second, since Python 3.7 the keys are visited in the order in which they were inserted; in older versions of Python the order was not guaranteed.

mydict has keys 1, "two" and "three". Each one of them is assigned to k once, when the loop is run:

mydict = {1:"hello", "two":"world", "three":"!"}

for k in mydict:
    print(k)

1
two
three

We can use indexing to access the values corresponding to each key inside the loop, like so:

for k in mydict:
    print(k, mydict[k])

1 hello
two world
three !

We can also use the dictionary’s .items() method to get tuples of keys and values. The comma between the variables k and v unpacks the zeroth and first elements of the tuple into separate variables: the key ends up in k again, and the value ends up in v.

for k,v in mydict.items():
    print(k,v)

1 hello
two world
three !
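
As a slightly more chemical sketch of the same pattern, the loop below walks over a dictionary of approximate molar masses. The dictionary and the variable names are made up for this illustration.

```python
# Approximate molar masses in g/mol
molar_masses = {"H2O": 18.02, "CO2": 44.01, "NaCl": 58.44}

lines = []
for formula, mass in molar_masses.items():
    # formula receives the key, mass receives the value
    lines.append("{}: {} g/mol".format(formula, mass))

for line in lines:
    print(line)
```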

4.3. functions

Functions are a key part of programs. They are a series of commands that - once defined - can be executed repeatedly at later parts of the program. Functions accept inputs and return outputs.

A Python function is defined using the def keyword. def is followed by the name of the function (the rules for these names are the same as for variables), then parentheses containing a list of arguments. The indented code block inside a function is run whenever the function is called, with the values passed in the call assigned to variables named after the arguments.

Finally, the return statement is used to return data from the function. When you call the function, the call evaluates to the value given in the return statement.

A function can also include if and for blocks.

def function_name(arg1, arg2, arg3,...):
    #codeblock
    return #return values, optional

To execute a function, call it by typing the function name followed by parentheses. If the function has required arguments, these need to be written inside the parentheses as well.

Here, we define the function greet, which has a single required argument name. When we call the function, it takes the argument, appends it to the string "Hello, " and prints the result. There is no return statement here - but that’s OK. return is optional.

The string following the first line of the function definition is called the docstring. It is also optional. We can use it to describe what the function does and how to use it.

def greet(name):
    """this function takes a single required argument: `name`"""
    print("Hello, "+name)

greet("Georg")

Hello, Georg

When multiple arguments are used, they can be passed either by position or by keyword.

Arguments given without a keyword are called “positional arguments”. The values passed are assigned to the arguments in the same order as in the function definition. Here, the first argument "Georg" is assigned to the name argument, and the second argument "Guten Tag", is assigned to the greeting argument.

In the second call we are using “keyword arguments”. That means we use the name of the argument and an equals sign to assign values in the function call. In this case, the order of the arguments does not matter.

def greet(name, greeting):
    """this function takes two required arguments:
    `name` and `greeting`"""
    print(greeting + ", " + name + "!")

greet("Georg", "Guten Tag") #positional

greet(greeting="Servus", name="Georg") #keyword

Guten Tag, Georg!
Servus, Georg!

Positional arguments and keyword arguments can be mixed. It is however not allowed to use a positional argument after the first keyword argument has been used in a function call.
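
The sketch below illustrates both halves of that rule. It uses a variant of greet that returns the string instead of printing it. To show the forbidden form without stopping the script, the call is passed as text to the built-in compile function inside a try/except block. Neither compile nor try/except is covered in this lesson; they are used here only to catch the error.

```python
def greet(name, greeting):
    """Return a greeting string (variant of greet that returns instead of printing)."""
    return greeting + ", " + name + "!"

# positional argument first, keyword argument second: allowed
mixed = greet("Georg", greeting="Moin")
print(mixed)  # Moin, Georg!

# keyword argument first, positional argument second: not allowed
# compile() only parses the text, so Python reports the error without calling greet
try:
    compile('greet(name="Georg", "Moin")', "<example>", "eval")
    error_message = None
except SyntaxError as error:
    error_message = error.msg

print(error_message)
```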

When defining a function we can also set up optional arguments by adding an argument with a default value assigned using the = sign. Optional arguments don’t have to be passed when a function is called. If they are omitted during the call, the default value set when the function was defined is used instead.

Below, the default value for the greeting argument is "Hello". In the first call, we pass only the required argument name. Hence, the default value "Hello" is used for greeting. In the second call, we also pass the second argument (as a positional argument). This time, the greeting used is "Goodbye".

def greet(name, greeting="Hello"):
    """ this function takes two arguments: 
    `name` and `greeting`
    `greeting` is optional""" 
    print(greeting +", " + name +"!")
    
greet("Toby")
greet("Toby", "Goodbye")

Hello, Toby!
Goodbye, Toby!

Warning

Default values are evaluated only once, when the function is defined. This can lead to “weird” behavior for mutable data types.

Here is an example of how this can go wrong. We define a function that appends a name to a list of names. Because we might not already have such a list, we make name_list an optional argument, with an empty list as a default (a terrible idea, as we will see in a second). First, if we pass both arguments, everything works as expected. The additional name "Jonathan" is appended to the list we passed and the updated name list is returned.

def append_name(name, name_list=[]):
    name_list.append(name)
    return name_list

#normal behaviour
name_list = ["Antoni", "Tan", "Karamo", "Bobby"]
updated_name_list = append_name("Jonathan", name_list)
updated_name_list

['Antoni', 'Tan', 'Karamo', 'Bobby', 'Jonathan']

If we don’t pass a list as an argument, everything still seems to work correctly the first time we call the function.

append_name("Jeremy")

['Jeremy']

However, when we call the function again, we get unexpected behavior. Instead of just returning a list with a single name, we get two elements:

append_name("Beremy")

['Jeremy', 'Beremy']

What is happening here? The list used as the default value is created once, when the function is defined. Hence, every time we call the function without passing name_list, another element is appended to that same list.

If you need a mutable default argument, the following construct is typically used:

def append_name(name, name_list=None):
    if name_list is None:
        name_list = [] #new list for every function call
    name_list.append(name)
    return name_list

In this case, whenever no value is passed for name_list, then a new list is created.
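
A quick check of this version, as a sketch: two calls without name_list now give two independent lists. The names first and second are chosen for this illustration.

```python
def append_name(name, name_list=None):
    if name_list is None:
        name_list = []  # new list for every function call
    name_list.append(name)
    return name_list

first = append_name("Jeremy")
second = append_name("Beremy")

print(first)            # ['Jeremy']
print(second)           # ['Beremy']
print(first is second)  # False: the two calls did not share a list
```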

4.3.1. `return`

return is used to return values from a function to the calling program. All values to the right of the return keyword are returned to the program that called the function. Multiple values can be returned as a tuple.

The function add_1 that we define below adds 1 to whatever we pass and returns the result. Hence, when we call the function and use it on the right side of an assignment to the variable value, value ends up being 1 + 1 = 2.

def add_1(number): 
    """
    adds 1 to number
    """
    new_number = number + 1
    return new_number

value = add_1(1)
value

2

Functions without a return nevertheless return a value: None. None is used to denote “there is no value at all”. In this case, the fact that "bla" is printed shows that the function has been run. We again assign the output to the variable value, which this time ends up containing None.

def this_has_no_return():
    print("bla")
value = this_has_no_return()
print(value)

bla
None
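
Returning multiple values as a tuple, as mentioned at the start of this section, could look like the following sketch. The function min_and_max is a hypothetical helper written for this illustration.

```python
def min_and_max(values):
    """Return the smallest and the largest element of `values`."""
    # the comma creates a tuple, so both values are returned together
    return min(values), max(values)

result = min_and_max([3.2, 1.5, 4.8])
print(result)  # (1.5, 4.8)

# the returned tuple can be unpacked into separate variables
lowest, highest = min_and_max([3.2, 1.5, 4.8])
print(lowest, highest)  # 1.5 4.8
```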

4.3.2. docstrings

It is considered good coding style to add docstrings to all your functions, even if you know what they do (right now). Future you will be very grateful.

Docstrings are enclosed by """...""" and placed right after the def line. A docstring should start with a single summary line. For a more complex function, a more in-depth description should follow: first the parameters, then the returned values and finally a full description of the function. The numpy and scipy projects have very thorough documentation that even gives usage examples and discusses the effect of different parameters. See, for example, the docstring of the scipy.signal.savgol_filter function:

from scipy.signal import savgol_filter

help(savgol_filter)

Help on function savgol_filter in module scipy.signal._savitzky_golay:

savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0, axis=-1, mode='interp', cval=0.0)
    Apply a Savitzky-Golay filter to an array.
    
    This is a 1-D filter. If `x`  has dimension greater than 1, `axis`
    determines the axis along which the filter is applied.
    
    Parameters
    ----------
    x : array_like
        The data to be filtered. If `x` is not a single or double precision
        floating point array, it will be converted to type ``numpy.float64``
        before filtering.
    window_length : int
        The length of the filter window (i.e., the number of coefficients).
        `window_length` must be a positive odd integer. If `mode` is 'interp',
        `window_length` must be less than or equal to the size of `x`.
    polyorder : int
        The order of the polynomial used to fit the samples.
        `polyorder` must be less than `window_length`.
    deriv : int, optional
        The order of the derivative to compute. This must be a
        nonnegative integer. The default is 0, which means to filter
        the data without differentiating.
    delta : float, optional
        The spacing of the samples to which the filter will be applied.
        This is only used if deriv > 0. Default is 1.0.
    axis : int, optional
        The axis of the array `x` along which the filter is to be applied.
        Default is -1.
    mode : str, optional
        Must be 'mirror', 'constant', 'nearest', 'wrap' or 'interp'. This
        determines the type of extension to use for the padded signal to
        which the filter is applied.  When `mode` is 'constant', the padding
        value is given by `cval`.  See the Notes for more details on 'mirror',
        'constant', 'wrap', and 'nearest'.
        When the 'interp' mode is selected (the default), no extension
        is used.  Instead, a degree `polyorder` polynomial is fit to the
        last `window_length` values of the edges, and this polynomial is
        used to evaluate the last `window_length // 2` output values.
    cval : scalar, optional
        Value to fill past the edges of the input if `mode` is 'constant'.
        Default is 0.0.
    
    Returns
    -------
    y : ndarray, same shape as `x`
        The filtered data.
    
    See Also
    --------
    savgol_coeffs
    
    Notes
    -----
    Details on the `mode` options:
    
        'mirror':
            Repeats the values at the edges in reverse order. The value
            closest to the edge is not included.
        'nearest':
            The extension contains the nearest input value.
        'constant':
            The extension contains the value given by the `cval` argument.
        'wrap':
            The extension contains the values from the other end of the array.
    
    For example, if the input is [1, 2, 3, 4, 5, 6, 7, 8], and
    `window_length` is 7, the following shows the extended data for
    the various `mode` options (assuming `cval` is 0)::
    
        mode       |   Ext   |         Input          |   Ext
        -----------+---------+------------------------+---------
        'mirror'   | 4  3  2 | 1  2  3  4  5  6  7  8 | 7  6  5
        'nearest'  | 1  1  1 | 1  2  3  4  5  6  7  8 | 8  8  8
        'constant' | 0  0  0 | 1  2  3  4  5  6  7  8 | 0  0  0
        'wrap'     | 6  7  8 | 1  2  3  4  5  6  7  8 | 1  2  3
    
    .. versionadded:: 0.14.0
    
    Examples
    --------
    >>> from scipy.signal import savgol_filter
    >>> np.set_printoptions(precision=2)  # For compact display.
    >>> x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])
    
    Filter with a window length of 5 and a degree 2 polynomial.  Use
    the defaults for all other parameters.
    
    >>> savgol_filter(x, 5, 2)
    array([1.66, 3.17, 3.54, 2.86, 0.66, 0.17, 1.  , 4.  , 9.  ])
    
    Note that the last five values in x are samples of a parabola, so
    when mode='interp' (the default) is used with polyorder=2, the last
    three values are unchanged. Compare that to, for example,
    `mode='nearest'`:
    
    >>> savgol_filter(x, 5, 2, mode='nearest')
    array([1.74, 3.03, 3.54, 2.86, 0.66, 0.17, 1.  , 4.6 , 7.97])
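
As a sketch of how the same layout could look for a small function of your own: the function dilution_volume and its arguments are invented for this illustration and are not part of scipy.

```python
def dilution_volume(c_stock, c_target, v_target):
    """Calculate the stock solution volume needed for a dilution.

    Uses the relation c_stock * v_stock = c_target * v_target.

    Parameters
    ----------
    c_stock : float
        Concentration of the stock solution.
    c_target : float
        Desired concentration, in the same unit as `c_stock`.
    v_target : float
        Desired final volume.

    Returns
    -------
    v_stock : float
        Required volume of stock solution, in the same unit as `v_target`.
    """
    return c_target * v_target / c_stock

v_stock = dilution_volume(1.0, 0.1, 250.0)
print(v_stock)

# the docstring is stored on the function and is what help() displays
summary_line = dilution_volume.__doc__.splitlines()[0]
print(summary_line)
```

Calling help(dilution_volume) prints the full docstring, just as help(savgol_filter) did above.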

4.4. variable scope

The scope of a variable refers to its visibility: which parts of your program can read or write the variable.

Python handles variables in so-called “namespaces”. These are dictionaries that map variable names to objects. When your program reads a variable, the Python interpreter first looks up the variable in the namespace it is currently in. Thus, if the interpreter is in a function call, it first looks in the namespace of the function. If it can’t find the name there, it goes to the current namespace’s parent namespace and continues upwards. If the function was defined inside a module, the search reaches that module’s namespace; otherwise, it reaches the __main__ module namespace. The last place searched is the namespace of Python’s built-in names.

If the interpreter can’t find the variable anywhere, it raises a NameError exception.

In the following example, Python first looks for a inside the function read_a. Since it can’t find it there, it moves up one namespace, where it finds a.

def read_a():
    print(a)
a = "This is 'a'"
read_a()

This is 'a'

If a is redefined, read_a will see its new value and will print that out instead.

a = "No, this is Patrick"

read_a()

No, this is Patrick

When assigning a value to a variable, Python uses the local namespace by default. In the function store_a, we assign the argument that was passed to the function to the variable a. Since Python searches for variables from the innermost scope outward, the call to print uses the value that was assigned inside the function. Hence, when we pass "Patrick", that is what is printed from inside the function. However, the assignment inside the function does not change the value of a outside the function; it remains "Spongebob".

def store_a(value):
    a = value
    print("Inside function:", a)

a = "Spongebob"
store_a("Patrick")

print("Outside function:", a)

Inside function: Patrick
Outside function: Spongebob
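
If the outer variable is supposed to change, a common pattern is to return the new value and assign it in the calling code. A sketch, using a modified store_a that returns its local a (the name a_from_function is chosen for this illustration):

```python
def store_a(value):
    a = value  # creates a local variable a, separate from the outer a
    return a   # hand the local value back to the caller

a = "Spongebob"

# calling the function alone leaves the outer a untouched
a_from_function = store_a("Patrick")
print(a)  # Spongebob

# assigning the return value in the outer code changes the outer a
a = store_a("Patrick")
print(a)  # Patrick
```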

4.5. Summary

1. if elif else

if <expression 1>:
    # this part is executed only if <expression 1> is True
elif <expression 2>:
    # this part is executed only if <expression 1> is False and <expression 2> is True
else:
    # this part is executed only if <expression 1> and <expression 2> are both False

2. for

for <var> in <iterable>:
    # this block is repeated until all values of <iterable> have been used up
    # each time <var> takes the value of the next item in <iterable>

Use break to stop a loop early. Use continue to skip ahead to the next iteration.
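
A small sketch showing both keywords in one loop. The pH readings are invented for this illustration: None stands for a failed measurement, and a value above 14 is treated as a sign that something went wrong.

```python
# Hypothetical pH readings; None marks a failed measurement
readings = [7.1, None, 6.8, 14.5, 7.0]

valid = []
for reading in readings:
    if reading is None:
        continue  # skip this item and go on with the next one
    if reading > 14:
        break     # leave the loop completely
    valid.append(reading)

print(valid)  # [7.1, 6.8]
```

The last reading, 7.0, is never looked at because break ends the loop before it is reached.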

3. functions

def function_name(arg1, arg2, arg3,..., argx=<default value>,...):
    """docstring"""
    #codeblock
    return #return values, optional
